Cell, Vol. 40, 477-480, March 1985, Copyright © 1985 by MIT

Minireview

Molecular Organization
of the AIDS Retrovirus



A. B. Rabson and M. A. Martin
National Institute of Allergy
and Infectious Diseases
National Institutes of Health
Bethesda, Maryland 20205



Since their original clinical description in 1981, the
Acquired Immunodeficiency Syndrome (AIDS) and the
AIDS-Related Complex (ARC) have become extremely
important public health problems throughout the world.
By the end of 1984, approximately 8,000 cases of AIDS
had been diagnosed in the USA with an overall mortality 
of over 40%; more than 100 new cases are reported each 
week. A major advance in the understanding of the patho- 
genesis of this disease has been the isolation of a novel 
retrovirus from AIDS and ARC patients. Serologic studies 
have demonstrated the association of viral infection and 
clinical disease. During its propagation in vitro, the AIDS 
retrovirus (AIDS RV) preferentially infects and kills human 
T lymphocytes of the OKT4/Leu-3 subset, the same cells
apparently destroyed in patients with the disease. Both
seroepidemiological and in vitro data strongly suggest that
the retrovirus isolated from patients is in fact the causative 
agent of AIDS. This virus has been variously called lymph- 
adenopathy virus (LAV), Human T Lymphotropic Virus III 
(HTLV-III) and AIDS-Associated Retrovirus (ARV). We 
shall refer to it as the AIDS RV. With the recent publication 
of the complete nucleotide sequences of molecular clones 
corresponding to the three isolates, it is now possible to
identify different viral genes and make assessments re- 
garding their expression and structural variability. 

The DNA sequence of the AIDS RV has been derived 
from molecular clones of both unintegrated and integrated 
proviruses. Wain-Hobson et al. (Cell 40, 9-17, 1985)
sequenced an integrated LAV provirus cloned at HindIII sites
present in the R region of both long terminal repeats
(LTRs). Ratner et al. (Nature 313, 277-284, 1985) derived
the sequences of 8.9 kb of one cloned unintegrated HTLV-III
provirus, as well as portions of two additional
unintegrated proviruses and one integrated provirus.
Sanchez-Pescador et al. (Science 227, 484-492, 1985) sequenced
various regions of four integrated ARV-2 clones and one
unintegrated ARV-2 circular provirus.

The sequences of LAV, HTLV-III, and ARV-2 show
general agreement in the size and organization of the
AIDS RV genome. The AIDS RV is the longest retrovirus
sequenced to date. The DNA provirus with two LTRs is
9734-9749 bp in length, the larger size being observed in
a clone of HTLV-III containing a 15 bp duplication in the
env gene. As shown in Fig. 1, the AIDS RV contains many
characteristic features of replication competent
retroviruses: LTRs, group specific antigen genes (gag), a gene
region (pol) encoding reverse transcriptase as well as
putative endopeptidase and integrase enzymes, and a
gene encoding the virus envelope glycoprotein (env).
Although all replication competent viruses contain these
genes, each retroviral family exhibits unique structural
features and its own program of gene expression. The
AIDS RV is no exception. In addition to having an
unusually long overlap of the gag and pol genes and no
overlap of pol and env (in contrast to other retroviruses), the
AIDS RV contains two novel open reading frames (Fig. 1,
segments A and B) that may play a role in its unusual
cytopathogenicity.

Unique Features of the AIDS Proviral DNA
LTR: The 634 bp AIDS RV LTR is similar in size to other
type C mammalian retroviral LTRs. Unlike HTLV-I, which
has an LTR containing R and U5 regions of 228 and 176
bp, respectively, the sizes of R and U5 of the AIDS RV LTR
are 97-98 and 83-84 bp, respectively. One of the major
surprises of the AIDS provirus sequence is the presence
of a tRNA primer binding site (pbs) complementary (18/18
match) to tRNA(Lys). All other infectious mammalian
retroviruses have a tRNA(Pro) pbs, except mouse mammary
tumor virus (MMTV), which also contains the
tRNA(Lys) pbs. As is true of other retroviruses except HTLV-I,
HTLV-II, and BLV, the AIDS RV LTR contains a
polyadenylation signal (AATAAA) within R, 19 bp 5' to the R-U5
boundary, and a TATA sequence 22-27 bp 5' to the
presumed mRNA start site. No CAAT sequence, a feature
common to several LTRs, could be identified in its usual 
position 60-80 bp 5' to the mRNA cap site. 
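
Signal positions of this kind can be tabulated with a simple string scan. The following is a minimal sketch in Python, not the procedure used by any of the sequencing groups. The function names, the toy sequence, and the convention of counting the bases between the 3' end of a motif and a downstream landmark are all assumptions made for illustration.

```python
def find_motif(seq, motif):
    """Return 0-based start positions of every occurrence, overlaps included."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    pos = seq.find(motif)
    while pos != -1:
        hits.append(pos)
        pos = seq.find(motif, pos + 1)
    return hits


def bases_between(motif_start, motif_len, landmark):
    """Count the bases lying between the end of a motif and a downstream landmark."""
    return landmark - (motif_start + motif_len)


# Hypothetical toy sequence, NOT the AIDS RV LTR: a polyadenylation
# signal followed by 19 filler bases and an arbitrary landmark.
toy = "GGC" + "AATAAA" + "C" * 19 + "TTGG"
boundary = toy.index("TTGG")  # stand-in for a region boundary

for start in find_motif(toy, "AATAAA"):
    print(start, bases_between(start, len("AATAAA"), boundary))
```

The same scan with "TATA" or "CAAT" as the motif reports whether those signals occur and how far they lie from a chosen cap site.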

[Figure 1: linear map of the provirus, drawn against a scale of
0 to 10 kb, with the LTR, gag, pol, and env regions labeled.]

Figure 1. Organization of the AIDS Retrovirus

The gene size and arrangement are derived from the three
published DNA sequences of the AIDS RV (see text).

gag Gene: Retroviral gag proteins are synthesized in
the form of a polyprotein precursor which is proteolytically
processed to individual gag proteins. The AIDS RV gag
region (Fig. 1) is approximately 1500 bp long and could
therefore encode a polyprotein of about 500 amino acids.
This size is consistent with the 53-55 kd protein detected
immunochemically in AIDS RV infected cells using
antibody present in patient sera. In addition, p24/p25 gag and
p17 gag proteins have been identified by immunoblotting
or immunoprecipitation procedures. All three groups that
sequenced the viral genome determined the partial amino
acid sequence of one or more putative gag proteins; the
latter were then aligned with the deduced amino acid
sequence of the AIDS RV gag gene. Unlike the gag regions
of infectious murine, feline, and simian retroviruses, all of
which encode four gag proteins, the AIDS RV gag gene
specifies only three proteins. HTLV-I and HTLV-II also
have truncated gag genes that encode three proteins.
Based on amino acid composition and amino acid
sequence alignment with the gag regions of other
sequenced retroviruses, it was concluded that the analogue
of the second (of the four) gag proteins was absent from
the AIDS RV genome.
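
The step from a 1500 bp coding region to a protein of 53-55 kd can be checked with simple arithmetic. The sketch below is illustrative only: the figure of about 110 daltons per amino acid residue is a common rule of thumb assumed here, not a value taken from the sequence papers, and the function names are hypothetical.

```python
AVG_RESIDUE_DALTONS = 110.0  # assumed average residue mass (rule of thumb)


def codons(coding_bp):
    """Number of complete codons in a coding region of the given length."""
    return coding_bp // 3


def approx_mass_kd(n_residues, avg_residue_daltons=AVG_RESIDUE_DALTONS):
    """Rough unmodified polypeptide mass in kilodaltons."""
    return n_residues * avg_residue_daltons / 1000.0


gag_residues = codons(1500)           # about 500 amino acids
gag_mass = approx_mass_kd(gag_residues)
print(gag_residues, gag_mass)
```

Under that assumption a 500 residue polyprotein comes to roughly 55 kd, in line with the size of the precursor detected in infected cells.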

pol Gene: The pol region of the AIDS RV is in a different
reading frame from gag. It overlaps the 3' end of the gag
gene by 60 amino acids, and is otherwise similar in
organization to pol genes present in other mammalian
retroviruses. Alignment of AIDS RV pol nucleotide and
deduced amino acid sequences with corresponding 5'
endopeptidase, reverse transcriptase, and the 3'
endonuclease/integrase domains of other viruses is readily
accomplished. 

env Gene: The env region of retroviruses generally
encodes a single polyprotein precursor which is cleaved to
generate a larger external viral envelope glycoprotein
(encoded by sequences located in the 5' portion of the env
gene) that is attached by disulfide bonds to a smaller
transmembrane protein (encoded by 3' env sequences).
The AIDS RV possesses a very large env region capable
of encoding an 863 amino acid precursor with an
estimated unglycosylated molecular weight of 90-100 kd.
Another unusual feature of the AIDS RV env gene sequence
is the large number (30-32) of potential glycosylation
sites, the majority of which (24-26) are situated upstream
from the precursor cleavage site and would, therefore, be
located in the external env protein. As a consequence, the
molecular weights of the glycosylated env precursor and
external env proteins could both be considerably larger
than the peptides deduced from the nucleotide sequence.
In fact, immunoprecipitation and Western blot analysis of
AIDS RV infected cell lysates have revealed the existence
of 160 and 120 kd glycoproteins. 
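
Potential N-linked glycosylation sites are conventionally recognized as the tripeptide Asn-X-Ser or Asn-X-Thr, where X is any residue except proline. The sketch below shows how such sites could be counted on either side of a cleavage position. It is an illustration under that assumption; the peptide is a made-up toy string, not env, and the function names and cleavage index are hypothetical.

```python
import re

# Asn-X-Ser/Thr with X != Pro (one-letter codes); the lookahead
# allows overlapping sites to be counted separately.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_positions(protein):
    """Return 0-based positions of the Asn in each potential site."""
    return [m.start() for m in SEQUON.finditer(protein.upper())]


def split_at_cleavage(positions, cleavage_index):
    """Separate sites upstream of the cleavage position from those downstream."""
    upstream = [p for p in positions if p < cleavage_index]
    downstream = [p for p in positions if p >= cleavage_index]
    return upstream, downstream


toy_peptide = "NGTANPSNNSS"  # hypothetical example sequence
sites = sequon_positions(toy_peptide)
external, transmembrane = split_at_cleavage(sites, 5)
print(sites, len(external), len(transmembrane))
```

Applied to a deduced env sequence and its presumed cleavage site, the two counts correspond to the sites expected in the external and transmembrane proteins.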

Potential processing sites for the cleavage of the env
precursor glycoprotein would generate an unusually large
transmembrane protein containing approximately 350
amino acids. The three groups disagree about the
possible function of this processed env protein. Ratner et al.
refer to the entire AIDS RV env gene as the "env-lor" region
and argue that the 3' env region encodes a trans-acting
analogue of the pX-IV segment of HTLV-I. Evidence
supporting the existence of trans-activation of LTR mediated
transcription in HTLV-III infected cells seems to be quite
convincing (Sodroski et al., Science 227, 171-173, 1985).
However, the lack of any polynucleotide sequence
homology between the lor segment of HTLV-I and HTLV-III
proviral DNA precludes the mapping of its functional
equivalent at the present time. Ratner et al. use
hydrophilicity/hydrophobicity calculations to argue that the
approximately 350 amino acid env transmembrane protein of
HTLV-III is the functional equivalent of the separate 178
amino acid transmembrane and the 357 amino acid lor
proteins encoded by the HTLV-I genome. Wain-Hobson et
al. and Sanchez-Pescador et al. suggest no unusual role
for the large AIDS RV env transmembrane protein.

An intriguing and potentially important difference
between the AIDS RV and other retroviruses is the presence
of two additional reading frames denoted "A" and "B" in
Fig. 1. "A" is an open reading frame 196-203 amino acids
long, situated between pol and env. A stretch of 580 bp of
noncoding sequence begins at the 3' end of "A." In
previously published sequences of other replication competent
retroviruses, the pol and env reading frames overlap
without intervening open reading frames or noncoding
regions. It is possible that the noncoding sequences in
this region of the AIDS RV have regulatory functions. A
second open reading frame ("B" in Fig. 1) unique to the
AIDS RV overlaps the 3' end of env and extends for
approximately 200 amino acids, terminating about 330
nucleotides into the U3 region of the 3' LTR. This reading
frame is open in LAV, ARV-2, and some clones of HTLV-III;
however, it appears to contain a termination codon in at
least one clone of HTLV-III. The "B" open reading frame
of LAV and ARV-2 is similar to the pX-IV open reading frame
of HTLV-I, which extends 76 nucleotides into the U3 region
of the 3' LTR (Seiki et al., PNAS 80, 3618-3622, 1983).
Furthermore, a 1.7 kb polyadenylated mRNA containing LTR
and "B" region sequences has been identified in AIDS RV
infected cells (Rabson et al., in preparation).
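
Open reading frames such as "A" and "B" are found by scanning each reading frame for long stretches free of termination codons. The sketch below illustrates the idea on one strand only and treats any stop-to-stop stretch as open, without requiring an initiating ATG. Both choices are assumptions made for brevity; the function name and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def open_frames(seq, min_codons):
    """Return (frame, start, length_in_codons) for stop-free runs on one strand."""
    seq = seq.upper()
    found = []
    for frame in range(3):
        run_start = frame
        run_len = 0
        for i in range(frame, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                if run_len >= min_codons:
                    found.append((frame, run_start, run_len))
                run_start = i + 3   # next run begins after the stop codon
                run_len = 0
            else:
                run_len += 1
        if run_len >= min_codons:   # run still open at the end of the sequence
            found.append((frame, run_start, run_len))
    return found


toy = "ATGAAACCCGGGTAA"  # hypothetical 15-base example
print(open_frames(toy, 4))
```

With a threshold near 200 codons, a scan of this kind over a complete provirus would list the long frames, and a termination codon interrupting "B" in a particular clone would show up as two shorter runs in place of one.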
Classification of the AIDS Retrovirus
Retroviruses have been traditionally classified on the
basis of their biology, electron microscopic morphology, and
genomic structure. 

Biological classifications have divided retroviruses into
three groups: 1) the oncoviruses, many of whose
members are naturally oncogenic, producing leukemias,
lymphomas, and breast carcinomas; 2) the spumaviruses or
foamy viruses, which produce vacuolization of tissue
culture cells but no known disease; and 3) the lentiviruses or
slow viruses, which produce cytopathic effect in tissue
culture cells and slowly progressive disease in animals. On
biological grounds, the AIDS RV, with its capacity to
produce dramatic cytopathic effect in tissue culture and
slowly progressive disease in man, seems to have many
features of a lentivirus. Although it is not associated with
leukemia, the AIDS RV shares its target cell tropism
(OKT4+ human lymphocytes) and ability to form syncytia
in these cells with HTLV-I and -II.

On the basis of electron microscopic morphology, the 
AIDS RV particle with its bar-shaped central structure
most closely resembles visna and equine infectious
anemia virus, both members of the lentivirus family.

Nucleic acid hybridization and, more recently,
nucleotide sequence analysis have been useful for classifying
microorganisms and establishing evolutionary
relationships between closely related agents. Analysis of HTLV-III,
ARV-2, and LAV nucleotide sequences clearly establishes
that they are all retroviruses. Although the deduced amino
acid sequences of the AIDS RV can be aligned with short
regions of analogous coding segments present in other
retroviruses, there are virtually no long stretches of
polynucleotide sequence homology with other proviral DNAs.
Thus, the AIDS RV genome is no more closely related to
other mammalian retroviruses than it is to Rous sarcoma
virus. The reported hybridization of HTLV-III with HTLV-I
under low stringency conditions (Hahn et al., Nature 312,
166-169, 1984) is difficult to explain in view of the three
published nucleotide sequences of the AIDS RV. Specific
duplex structures formed during the reaction of denatured
HTLV-I or HTLV-II DNAs with AIDS RV DNA would be
thermodynamically unstable even under low stringency
hybridization conditions. In this regard, HTLV-II, which has
considerable nucleotide identity to HTLV-I, does not react
with LAV proviral DNA (Alizon et al., Nature 312, 757-760,
1984).

Table 1. Sequence Comparison of AIDS Retroviral Isolates

Clones or Isolates    LTR
Compared              U3        R       U5      gag       pol       A        env         B

Nucleotide homology
HTLV-III BH10 x       8/456     NA      NA      ND        ND        ND       36/2607     14/651
HTLV-III BH8          (1.8)                                                  (1.4)       (2.2)

HTLV-III BH10 x LAV   11/456    1/97    0/85    ND        ND        ND       46/2607     14/651
                      (2.4)     (1.0)   (0)                                  (1.8)       (2.2)

LAV x ARV-2           30/456    1/97    1/85    46/1503   87/3012   32/610   242/2607    51/651
                      (6.6)     (1.0)   (1.1)   (3.1)     (2.9)     (5.2)    (9.3)       (7.8)
                                                                             190/1300 (a)
                                                                             (14.6)

Amino acid homology
LAV x ARV-2                                     17/501    32/1004   19/203   131/869     30/220
                                                (3.4)     (3.2)     (9.4)    (15.1)      (13.5)
                                                                             92/435 (b)
                                                                             (21.1)

Values are given as nucleotide differences/nucleotides compared or
amino acid differences/amino acids compared. The percentage of
heterogeneity is included within the parentheses.
(a) Comparison of the 5' 1300 nucleotides of env.
(b) Comparison of the 5' 435 amino acids of env.
NA: not applicable. ND: not done.

The question, therefore, remains how the AIDS RV 
should be classified. A recent report by Gonda et al. 
(Science 227, 173-177, 1985) indicated that HTLV-III
sequences located between 0.8 and 4.6 kb (see map in Fig.
1) hybridized to a 32P-labeled nick-translated visna virus
probe under low stringency conditions; Alizon et al. (op.
cit.) detected no reactivity under similar relaxed
conditions. Nonetheless, molecular structural arguments for
assigning the AIDS RV to the lentivirus group of mammalian
retroviruses are quite compelling. First, the large (9.7 kb)
size of the AIDS RV provirus is quite similar to the
10-10.3 kb proviral DNA described for visna (Molineaux and
Clements, Gene 23, 137-148, 1983). Second, the pbs
associated with the visna provirus is also tRNA(Lys) (K.
Staskus, E. Retzel, A. Haase, and A. Paras, personal
communication). Third, similar to the AIDS RV, visna
contains a truncated gag gene that encodes a p55 gag
precursor protein and p30, p16, and p14 processed gag
proteins (Querat et al., J. Virol. 52, 672-679, 1984). Fourth, the
sizes of the visna env glycoproteins (gp150 and gp135)
(Querat et al., J. Virol., op. cit.) are quite similar to the 160
and 120 kd env gene products present in AIDS RV infected
cells. Finally, visna and other lentiviruses, like the AIDS
RV, are exogenous agents since labeled viral DNA probes
fail to hybridize to preparations of uninfected host cell 
DNA. Coupled with the morphological properties and bio- 
logical characteristics of viral infections described above, 
the AIDS RVs would seem to be best classified within the 
lentivirus group of retroviruses. 
The Relationship of Different AIDS RV Isolates 
to One Another 

A very important issue in understanding the epidemiology
and pathogenesis of AIDS, as well as in developing
therapies and vaccine strategies, is the question of the
heterogeneity of the virus. How variable is one isolate from
another? Where do the variations occur and how will they
affect viral pathogenicity and antigenicity?

Two recent reports (Shaw et al., Science 226,
1165-1171, 1984; Luciw et al., Nature 312, 760-763, 1984)
indicate that AIDS RVs isolated from different individuals
exhibit striking structural heterogeneity as monitored by
restriction enzyme polymorphisms. In one instance (Shaw
et al., op. cit.), 19 of 31 cleavage sites differed. Although
the restriction maps of HTLV-III, ARV, and LAV proviral
DNAs have not been formally compared, superficial
inspection of published cleavage maps and Southern blots
suggests that HTLV-III and LAV are closely related to one
another whereas ARV and many other isolates are
substantially different.

An analysis of AIDS RV nucleotide sequence
heterogeneity is presented in Table 1. As a baseline for
comparisons, sequence variations between two HTLV-III clones
(BH10 and BH8), reported by Ratner et al. (op. cit.), were
determined. Small differences ranging from 1.4%-2.2%
could be demonstrated in the U3, env, and "B" segments.
A comparison of HTLV-III and LAV proviruses generated
virtually identical results, indicating that HTLV-III was no
more different from LAV than molecular clones of HTLV-III
were from one another. In contrast, striking differences
were apparent when LAV was compared to ARV. These
alterations were primarily concentrated in the 3' half of the
AIDS RV. The env gene in particular had undergone
multiple changes, resulting in a 9.3% nucleotide sequence
difference. Further analysis of the LAV and ARV env
genes indicates: 1) a majority of the nucleotide alterations
are located in the 5' half of the env coding sequences (and
result in nearly 15% nucleotide sequence heterogeneity
within this 1300 bp segment), and 2) the differences are
primarily a series of reciprocal insertions and deletions
(up to 24 bp) in the 5' half of the env region of the two
proviral DNAs. As shown in Table 1, the alterations in env
nucleotide sequences in LAV relative to ARV are not
trivial. They result in a 21% difference in the deduced
amino acid sequence of the two env glycoproteins. It is
interesting to note that of the 31 and 32 potential
glycosylation sites present in the LAV and ARV env region,
respectively, only 19 are identical.
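
The percentages quoted for env follow directly from the counts listed in Table 1 for the LAV and ARV-2 comparison: differences divided by positions compared, expressed as a percentage. The short sketch below reproduces that arithmetic. The function name is hypothetical and rounding to one decimal place is an assumption about how the published values were prepared.

```python
def percent_difference(differences, compared):
    """Differences per 100 positions compared, rounded to one decimal place."""
    if compared <= 0:
        raise ValueError("number of positions compared must be positive")
    return round(100.0 * differences / compared, 1)


# Counts as listed in Table 1 for LAV x ARV-2.
env_nucleotides = percent_difference(242, 2607)      # whole env gene
env_5prime_nucleotides = percent_difference(190, 1300)  # 5' 1300 nucleotides
env_5prime_residues = percent_difference(92, 435)    # 5' 435 amino acids

print(env_nucleotides, env_5prime_nucleotides, env_5prime_residues)
```

The three results correspond to the 9.3% overall nucleotide difference, the nearly 15% difference within the 5' 1300 bp segment, and the 21% amino acid difference discussed in the text.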

Although it is presently unclear why infection of man
with the AIDS RV results in a slow progressive
immunosuppressive disease, some models of lentivirus
persistence are consistent with the structural differences
observed between LAV/HTLV-III and ARV. Neutralization
studies indicate that the periodic nature of disease
caused by equine infectious anemia virus (EIAV), a
lentivirus, is due to the sequential appearance, in an infected
animal, of novel antigenic viral variants that temporarily
escape host immune surveillance. The different antigenic
strains of EIAV responsible for sequential febrile episodes
contain alterations confined to virion glycoproteins, as
monitored by tryptic peptide mapping analyses
(Montelaro et al., JBC 259, 10539-10544, 1984). Similar
antigenic variants have been reported for visna virus, with
changes mapping to the env gene (Scott et al., Cell 18,
321-327, 1979; Clements et al., PNAS 77, 4454-4458,
1980). However, the relationship of env glycoprotein
antigenic variations and viral pathogenesis is unclear. In vivo,
viral persistence may be due to a restriction of visna gene
expression (Haase et al., Science 195, 175-177, 1977).

The analysis of nucleotide sequence heterogeneity
presented in Table 1 indicates that HTLV-III and LAV are
virtually identical. This result is surprising in view of their
independent isolation and published reports, cited above,
which show that extensive restriction enzyme
polymorphisms exist among different AIDS RV isolates. To
evaluate the significance of the differences in the env genes of
ARV vs. HTLV-III/LAV, another independent isolate must
be sequenced. If this verifies the substantial variability in
env glycoproteins (reflecting genetic alterations attending
AIDS RV infection in man), then preventive and
therapeutic strategies may have to be directed to less variable
regions of the viral genome.






